Poster: Web-Designer-2008 Date: Oct 19, 2008 9:33am
Forum: web Subject: Re: Please delete the following sites

As I mentioned in my initial post, I have sold the websites, so robots.txt will do me no good.

Please read my initial post and delete the websites:

dalmatia.net and vukovar.com

Thank you.

Poster: kustota Date: Oct 19, 2008 9:53am
Forum: web Subject: Re: Please delete the following sites

repeating your requests here will do you no good for sure. please read the faq so you will know what you can do to delete the archives. if the new owners are interested in this, they simply should use robots.txt themselves.

Poster: Face_ Date: Oct 19, 2008 9:47am
Forum: web Subject: Re: Please delete the following sites

The IA won't just remove a website like that, because they cannot be sure that you are who you say you are. If the new owners want the sites removed, can't you point them to this thread? Altering the robots.txt should be very simple to do.

Poster: naaier Date: Nov 10, 2008 8:53am
Forum: web Subject: Re: Please delete the following sites

That is definitely the wrong question to ask!

If IA wants to save other people's websites and make all versions available on THEIR website, THEY should be the ones asking for permission BEFORE doing it. I didn't even know about archive.org at the time.

Why should I have to waste time taking down a copy of my site that I deleted years ago? And the website doesn't mention anywhere how to get our websites off IA. Forget that robots.txt crap. It's no use when the website is offline. Besides, there weren't a lot of crawlers around at the time, and people didn't know about such things.

So to me, IA is like spam, but worse. I didn't want it, now I've got it, and I can't get rid of it... I hope the idiots from IA are proud of themselves! I hope they get a few HDD crashes and their backup goes up in flames. I am honestly pissed off because crap like this should be illegal, and it probably is, but since it's not worth the time, nothing will happen.

Poster: Web-Designer-2008 Date: Nov 13, 2008 8:26am
Forum: web Subject: Re: Please delete the following sites

I have asked the present owner to add the following to their robots.txt file:

User-agent: ia_archiver
Disallow: /

How long does it take for the archives to be removed?

Thank you.

Poster: kustota Date: Nov 13, 2008 9:38am
Forum: web Subject: Re: Please delete the following sites

after the new owner changes robots.txt, the sites must be crawled again, and within a day after a crawl, the archives should be removed. when you are sure that the robots.txt files are updated (they are not yet), you can enter your sites' URLs into the 'Archive That!' form (http://www.archive.org/web/web.php) to speed up the process.
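
for anyone who wants to confirm that a robots.txt really blocks ia_archiver before resubmitting a site, here is a minimal sketch using Python's standard urllib.robotparser module. it parses the two rules quoted earlier in this thread as plain text, so it makes no network request. the helper name blocks_agent and the second agent name SomeOtherBot are made up for illustration, and this shows only how the standard parser reads the rules, not how the Wayback Machine itself evaluates them.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted earlier in this thread.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def blocks_agent(robots_txt: str, agent: str, url: str = "/") -> bool:
    """Return True if the robots.txt text disallows `agent` from fetching `url`.

    Hypothetical helper for illustration; not part of any archive.org tooling.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    print("ia_archiver blocked:", blocks_agent(ROBOTS_TXT, "ia_archiver"))
    print("SomeOtherBot blocked:", blocks_agent(ROBOTS_TXT, "SomeOtherBot"))
```

to check a live site instead, the same parser can load the file itself with set_url() and read(), which does need network access.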

This post was modified by kustota on 2008-11-13 17:38:38

Poster: Web-Designer-2008 Date: Nov 13, 2008 9:58am
Forum: web Subject: Re: Please delete the following sites

Thank you, I just checked it. Robots.txt does work.

My only question is about the response I get when I query my old site:

"We're sorry, access to http://zzz.com/* has been blocked by the site owner via robots.txt."

Does this mean that even though the access to the site archive is blocked, the old archive is still stored at Archive.org?

Thank you.

Poster: Web-Designer-2008 Date: Nov 13, 2008 9:09pm
Forum: web Subject: Re: Please delete the following sites

Thanks, Face_ and kustota.

zzz.com was just an example, not a real site. Maybe it is :-)

It would be interesting to know for sure if there is any way for the owner to ensure that IA deletes all his archives on request, not just blocks them.



Poster: kustota Date: Nov 13, 2008 9:55pm
Forum: web Subject: Re: Please delete the following sites

i don't think there is a reason to prevent a third party from keeping archived copies of your site. i don't even see any legal grounds that would allow you to prevent them from doing it.
and what if your site changes hands? do you want the new owner to have such rights, to erase all of your work completely?

Poster: protheus Date: Dec 24, 2008 5:33am
Forum: web Subject: Re: Please delete the following sites

@kustota:
" ... and what if your site changes hands? do you want the new owner to have such rights, to erase all of your work completely?"

A person who sends a deletion request wants JUST THAT: to erase all of his work from your archive. ;-)

As soon as the requester's identity is verified, you should DELETE all the data concerned, not just block access to it.


This post was modified by protheus on 2008-12-24 13:33:41

Poster: Face_ Date: Nov 13, 2008 10:29am
Forum: web Subject: Re: Please delete the following sites

Ehm... I don't get that message when I check www.zzz.com. I see quite a few copies:

http://web.archive.org/web/*/http://www.zzz.com/

The site also currently does not seem to have a robots.txt (www.zzz.com/robots.txt), even though it once had one (web.archive.org/web/*/http://www.zzz.com/robots.txt).

vukovar.com seems to be blocked now (web.archive.org/web/*/http://www.vukovar.com), but dalmatia.net does not (web.archive.org/web/*/http://www.dalmatia.net).

As for your question, well, good question! Does kustota know the answer?

Poster: kustota Date: Nov 13, 2008 6:37pm
Forum: web Subject: Re: Please delete the following sites

as i understand it, when there is a robots.txt block, your site is not crawled and becomes inaccessible in the wayback machine. i guess the old versions of the site (from before the block) are stored somewhere in alexa's archives. i might be wrong.